
CSI5155 - Online Learning


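The goal of this assignment is to perform online learning on evolving data streams within the Scikit-Multiflow environment. Specifically, we study the effect of concept drift on the behaviour of learning algorithms, while carrying out prequential evaluation.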

Instructions


You are asked to use three of the Insects data streams in this assignment. The data may be downloaded at https://sites.google.com/view/uspdsrepository (the password is DMKD2018).

This data was obtained from a laser sensor built with low-cost components to remotely capture information about flying insects, in order to aid in intelligent insect trap design. Specifically, we will only use the Insects-Abrupt-Balanced, Insects-Incremental-Balanced and Insects-Incremental-Gradual-Balanced streams in this assignment. Sections 5 to 7 of [1] contain details about the data and the experimentation relevant for this assignment.

1. In Section 7.1 of the reference paper, the authors first consider the no-change and majority class classifiers, with a moving window of 1000 instances over the stream. As a first step, you are asked to conduct these experiments against the three data streams listed above. Following [1], use prequential accuracy over a sliding window of 1000 instances to report your results.

2. Next, use the following algorithms to construct models against the three data streams: Hoeffding Trees, SAM-KNN, Hoeffding Adaptive Trees, as well as two (2) ensemble-based methods of your choice. Again, you should report the prequential accuracies over a sliding window of 1000 instances.

3. Create figures, similar to Figures 22 to 27 of the reference paper, to show the prequential accuracies against the three streams, for the learners used in steps 1 and 2.

4. Next, combine the Hoeffding Tree learner with a drift detection method of your own choice, again using the same setting as the paper in terms of window size (1000). Report the prequential accuracies over a sliding window of 1000 instances.


5. Create figures, similar to Figures 28 to 30 of the reference paper, to show the prequential accuracies against the three streams, for step 4.

6. Create a table, similar to Table 5 of the reference paper, summarizing the prequential accuracies you achieved in steps 1, 2 and 4.

7. Discuss the results you obtained and the lessons you learned when analysing this data. 

8. Contrast the results you obtained during this assignment with those of the reference paper. Be sure to discuss any differences in methodology and results, and to highlight any similarities.
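
The baselines of step 1 and the sliding-window prequential accuracy used throughout can be sketched in plain Python as follows. This is a minimal illustration, not the reference paper's implementation: the method names `predict_one` and `learn_one` are hypothetical, the majority class baseline is assumed to count labels over the most recent window, and the toy label sequence with a window of 3 stands in for an Insects stream with a window of 1000. A Scikit-Multiflow learner would need a thin adapter to fit this interface.

```python
from collections import Counter, deque


class NoChangeClassifier:
    """Baseline: predicts the label of the most recently seen instance."""

    def __init__(self):
        self.last_label = None

    def predict_one(self, x):
        return self.last_label

    def learn_one(self, x, y):
        self.last_label = y


class MajorityClassClassifier:
    """Baseline: predicts the most frequent label among the last `window` labels."""

    def __init__(self, window=1000):
        self.labels = deque(maxlen=window)
        self.counts = Counter()

    def predict_one(self, x):
        if not self.labels:
            return None
        return max(self.counts, key=self.counts.get)

    def learn_one(self, x, y):
        # Forget the label that is about to fall out of the window.
        if len(self.labels) == self.labels.maxlen:
            self.counts[self.labels[0]] -= 1
        self.labels.append(y)
        self.counts[y] += 1


def prequential_accuracy(model, stream, window=1000):
    """Test-then-train loop: predict each instance first, then learn from it.

    Returns the accuracy over the last `window` predictions after each instance.
    """
    hits = deque(maxlen=window)
    correct = 0
    curve = []
    for x, y in stream:
        hit = int(model.predict_one(x) == y)
        if len(hits) == window:
            correct -= hits[0]  # oldest outcome leaves the window
        hits.append(hit)
        correct += hit
        model.learn_one(x, y)
        curve.append(correct / len(hits))
    return curve


# Toy usage: features are ignored by both baselines, so None is used for x.
toy_stream = [(None, y) for y in [0, 0, 1, 1, 1, 0]]
no_change_curve = prequential_accuracy(NoChangeClassifier(), toy_stream, window=3)
majority_curve = prequential_accuracy(MajorityClassClassifier(window=3), toy_stream, window=3)
print(no_change_curve)
print(majority_curve)
```

The returned list holds one accuracy value per instance, so it can be plotted directly against the instance index for steps 3 and 5, or averaged for the summary table in step 6.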
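
For step 4, one common way to combine a learner with a drift detector is to rebuild the learner whenever the detector signals a change. The sketch below illustrates that idea only and is not the method of the reference paper. The detector is assumed to expose `add_element` and `detected_change`, the method names used by the Scikit-Multiflow drift detectors. `RunningMajority` and `ErrorRateDetector` are hypothetical stand-ins for a Hoeffding Tree and a real detector, and `predict_one` / `learn_one` are hypothetical method names chosen for this example.

```python
from collections import Counter, deque


class RunningMajority:
    """Hypothetical stand-in learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict_one(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None

    def learn_one(self, x, y):
        self.counts[y] += 1


class ErrorRateDetector:
    """Hypothetical stand-in detector: flags drift when the error rate over
    the last `window` predictions exceeds `threshold`."""

    def __init__(self, window=30, threshold=0.5):
        self.errors = deque(maxlen=window)
        self.threshold = threshold

    def add_element(self, error):
        self.errors.append(error)

    def detected_change(self):
        full = len(self.errors) == self.errors.maxlen
        return full and sum(self.errors) / len(self.errors) > self.threshold


class DriftResetWrapper:
    """Rebuilds the base learner from scratch whenever the detector signals drift."""

    def __init__(self, make_model, make_detector):
        self.make_model = make_model
        self.make_detector = make_detector
        self.model = make_model()
        self.detector = make_detector()
        self.n_resets = 0

    def predict_one(self, x):
        return self.model.predict_one(x)

    def learn_one(self, x, y):
        # Feed the detector 1 for a misclassification and 0 for a hit.
        error = int(self.model.predict_one(x) != y)
        self.detector.add_element(error)
        if self.detector.detected_change():
            self.model = self.make_model()
            self.detector = self.make_detector()
            self.n_resets += 1
        self.model.learn_one(x, y)


# Toy stream with one abrupt drift: 200 instances of class 0, then 200 of class 1.
toy_stream = [(None, 0)] * 200 + [(None, 1)] * 200

wrapped = DriftResetWrapper(RunningMajority, ErrorRateDetector)
plain = RunningMajority()
wrapped_errors = 0
plain_errors = 0
for x, y in toy_stream:
    # Test-then-train: count the mistake before the learner sees the label.
    wrapped_errors += int(wrapped.predict_one(x) != y)
    plain_errors += int(plain.predict_one(x) != y)
    wrapped.learn_one(x, y)
    plain.learn_one(x, y)

print("resets:", wrapped.n_resets)
print("errors with reset:", wrapped_errors, "without reset:", plain_errors)
```

Resetting discards everything learned before the drift, which suits abrupt changes; for incremental or gradual drift a full reset may be too aggressive, which is worth keeping in mind when discussing results across the three streams.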

References


[1] Souza, V.M.A., dos Reis, D.M., Maletzke, A.G. et al. Challenges in benchmarking stream learning algorithms with real-world data. Section 5, Data Mining and Knowledge Discovery, 34, 1805–1858 (2020). URL: https://link.springer.com/article/10.1007/s10618-020-00698-5


[2] Scikit-Multiflow, URL: https://scikit-multiflow.github.io/

[3] Scikit-Multiflow learning methods, URL: https://scikit-multiflow.readthedocs.io/en/stable/api/api.html#learning-methods

 
[4] Bifet, A., Gavaldà, R., Holmes, G., and Pfahringer, B. Machine Learning for Data Streams with Practical Examples in MOA, 2018. URL: https://moa.cms.waikato.ac.nz/book/


